11 research outputs found

Convolutional Neural Fabrics

Author: Saxena Shreyas
Verbeek Jakob
Publication date: 04/12/2016

Despite the success of CNNs, selecting the optimal architecture for a given task remains an open problem. Instead of aiming to select a single optimal architecture, we propose a "fabric" that embeds an exponentially large number of architectures. The fabric consists of a 3D trellis that connects response maps at different layers, scales, and channels with a sparse homogeneous local connectivity pattern. The only hyper-parameters of a fabric are the number of channels and layers. While individual architectures can be recovered as paths, the fabric can in addition ensemble all embedded architectures together, sharing their weights where their paths overlap. Parameters can be learned using standard methods based on back-propagation, at a cost that scales linearly in the fabric size. We present benchmark results competitive with the state of the art for image classification on MNIST and CIFAR10, and for semantic segmentation on the Part Labels dataset. Comment: Corrected typos (In proceedings of NIPS16)
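The claim that a fabric of linear size embeds exponentially many architectures can be illustrated with a small path count. The sketch below is not the authors' code: it collapses the channel axis and assumes, purely for illustration, that a node at scale s in one layer connects to scales s-1, s, and s+1 in the next layer. The function name `count_paths` is hypothetical.

```python
def count_paths(layers: int, scales: int) -> int:
    """Count chain-structured paths through a layers x scales trellis.

    Assumption (illustration only): a node at scale s in one layer
    connects to scales s-1, s, and s+1 in the next layer.
    """
    ways = [1] * scales  # one way to start at each scale of the first layer
    for _ in range(layers - 1):
        # Paths reaching scale s come from the neighbouring scales one layer back.
        ways = [
            sum(ways[t] for t in (s - 1, s, s + 1) if 0 <= t < scales)
            for s in range(scales)
        ]
    return sum(ways)


if __name__ == "__main__":
    for layers in (2, 4, 8, 16):
        nodes = layers * 3  # node count grows linearly with the number of layers
        print(f"layers={layers:2d} nodes={nodes:3d} paths={count_paths(layers, 3)}")
```

Under this assumed connectivity the number of nodes grows linearly with the number of layers while the number of paths grows geometrically, which is the intuition behind sharing weights where paths overlap.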

arXiv.org e-Print Archive

Hal - Université Grenoble Alpes

INRIA a CCSD electronic archive server

Coordinated Local Metric Learning

Author: Saxena Shreyas
Verbeek Jakob
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 13/12/2015

Mahalanobis metric learning amounts to learning a linear data projection, after which the L2 metric is used to compute distances. To allow more flexible metrics, not restricted to linear projections, local metric learning techniques have been developed. Most of these methods partition the data space using clustering, and for each cluster a separate metric is learned. Using local metrics, however, it is not clear how to measure distances between data points assigned to different clusters. In this paper we propose to embed the local metrics in a global low-dimensional representation, in which the L2 metric can be used. With each cluster we associate a linear mapping that projects the data to the global representation. This global representation directly allows computing distances between points regardless of which local cluster they belong to. Moreover, it also enables data visualization in a single view, and the use of efficient L2-based retrieval methods. Experiments on the Labeled Faces in the Wild dataset show that our approach improves over previous global and local metric learning approaches.
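The first sentence of this abstract rests on a standard identity: a Mahalanobis distance with matrix M = LᵀL equals the L2 distance after projecting the data with L. The sketch below checks this numerically in plain Python. The matrix `L` and the points `x`, `y` are hypothetical values chosen for illustration, and the code does not implement the coordinated local method itself.

```python
import math


def project(L, x):
    """Apply the linear projection L (a list of rows) to the vector x."""
    return [sum(l_ij * x_j for l_ij, x_j in zip(row, x)) for row in L]


def l2(a, b):
    """Euclidean (L2) distance between two vectors."""
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))


def mahalanobis(L, x, y):
    """sqrt((x - y)^T M (x - y)) with M = L^T L formed explicitly."""
    d = [xi - yi for xi, yi in zip(x, y)]
    n = len(d)
    M = [[sum(L[k][i] * L[k][j] for k in range(len(L))) for j in range(n)]
         for i in range(n)]
    return math.sqrt(sum(d[i] * M[i][j] * d[j] for i in range(n) for j in range(n)))


# Hypothetical 2x3 projection and two 3-dimensional points.
L = [[1.0, 2.0, 0.0], [0.0, 1.0, -1.0]]
x, y = [1.0, 0.0, 2.0], [0.0, 1.0, 1.0]

if __name__ == "__main__":
    print(mahalanobis(L, x, y), l2(project(L, x), project(L, y)))
```

Both printed values agree, which is why L2-based retrieval methods can be applied directly once the data are projected.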

Hal - Université Grenoble Alpes

INRIA a CCSD electronic archive server

SPDF: Sparse Pre-training and Dense Fine-tuning for Large Language Models

Author: DeCoste Dennis
Gupta Abhay
Leong Kevin
Li Tianda
Lie Sean
Marshall William
Saxena Shreyas
Thangarasa Vithursan
Publication date: 29/07/2023

The pre-training and fine-tuning paradigm has contributed to a number of breakthroughs in Natural Language Processing (NLP). Instead of directly training on a downstream task, language models are first pre-trained on large datasets with cross-domain knowledge (e.g., Pile, MassiveText, etc.) and then fine-tuned on task-specific data (e.g., natural language generation, text summarization, etc.). Scaling the model and dataset size has helped improve the performance of LLMs, but unfortunately, this has also led to prohibitive computational costs. Pre-training LLMs often requires orders of magnitude more FLOPs than fine-tuning, and the model capacity often remains the same between the two phases. To achieve training efficiency w.r.t. training FLOPs, we propose to decouple the model capacity between the two phases and introduce Sparse Pre-training and Dense Fine-tuning (SPDF). In this work, we show the benefits of using unstructured weight sparsity to train only a subset of weights during pre-training (Sparse Pre-training) and then recover the representational capacity by allowing the zeroed weights to learn (Dense Fine-tuning). We demonstrate that we can induce up to 75% sparsity into a 1.3B parameter GPT-3 XL model, resulting in a 2.5x reduction in pre-training FLOPs, without a significant loss in accuracy on the downstream tasks relative to the dense baseline. By rigorously evaluating multiple downstream tasks, we also establish a relationship between sparsity, task complexity and dataset size. Our work presents a promising direction to train large GPT models at a fraction of the training FLOPs using weight sparsity, while retaining the benefits of pre-trained textual representations for downstream tasks. Comment: Accepted to Uncertainty in Artificial Intelligence (UAI) 2023 Conference; 13 pages, 4 figures (Main Paper) + 5 pages (Supplementary Material)
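The two-phase idea described above can be sketched with a toy weight vector: a fixed mask keeps a subset of weights at zero during the sparse phase, and the mask is dropped for the dense phase so the zeroed weights can learn. This is a minimal illustration, not the paper's training code. The helper names `make_mask` and `sgd_step`, the random mask, the constant placeholder gradient, and the learning rate are all assumptions.

```python
import random


def make_mask(n, sparsity, seed=0):
    """Return a 0/1 mask with round(n * sparsity) zeros at random positions."""
    rng = random.Random(seed)
    zeros = set(rng.sample(range(n), round(n * sparsity)))
    return [0 if i in zeros else 1 for i in range(n)]


def sgd_step(weights, grads, lr, mask=None):
    """One SGD step; when a mask is given, masked-out weights are not updated."""
    if mask is None:
        mask = [1] * len(weights)
    return [w - lr * g * m for w, g, m in zip(weights, grads, mask)]


n = 8
mask = make_mask(n, sparsity=0.75)   # 75% of the weights are masked out
weights = [0.5 * m for m in mask]    # masked weights start at zero
grads = [1.0] * n                    # placeholder gradient, for illustration only

sparse = sgd_step(weights, grads, lr=0.1, mask=mask)  # sparse pre-training step
dense = sgd_step(sparse, grads, lr=0.1)               # dense fine-tuning step

if __name__ == "__main__":
    print("mask  ", mask)
    print("sparse", sparse)
    print("dense ", dense)
```

After the sparse step the masked positions are still exactly zero; after the dense step every position has moved, which mirrors the recovery of representational capacity that the abstract describes.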

arXiv.org e-Print Archive

A Prospective Multicenter Study Evaluating Learning Curves and Competence in Endoscopic Ultrasound and Endoscopic Retrograde Cholangiopancreatography Among Advanced Endoscopy Trainees: The Rapid Assessment of Trainee Endoscopy Skills (RATES) Study

Author: Ali Meer Akbar
Brauer Brian
Carlin Linda
Chak Amitabh
Collins Dan
Cote Gregory A.
Diehl David L.
DiMaio Christopher J.
Dries Andrew
Early Dayna S.
Edmundowicz Steven
El-Hajj Ihab
Ellert Swan
Fairley Kimberley
Faulx Ashley
Fujii-Lau Larissa
Gaddam Srinivas
Gan Seng-Ian
Gaspar Jonathan P.
Gautamy Chitiki
Gordon Stuart
Hall Matt
Han Samuel
Harris Cynthia
Hyder Sarah
Jones Ross
Keswani Rajesh
Kim Stephen
Komanduri Srinadh
Law Ryan
Lee Linda
Mounzer Rawad
Mullady Daniel
Muthusamy V. Raman
Olyaee Mojtaba
Pfau Patrick
Piraka Cyrus
Rastogi Amit
Rosenkranz Laura
Rzouq Fadi
Saligram Shreyas
Saxena Aditi
Shah Raj J.
Simon Violette C.
Small Aaron
Sreenarasimhaiah Jayaprakash
Walker Andrew
Wang Andrew Y.
Wani Sachin
Watson Rabindra R.
Wilson Robert H.
Yachimski Patrick
Yang Dennis
Publication venue: Elsevier BV
Publication date: 01/01/2017

Background and aims: Based on the Next Accreditation System, trainee assessment should occur on a continuous basis with individualized feedback. We aimed to validate endoscopic ultrasound (EUS) and endoscopic retrograde cholangiopancreatography (ERCP) learning curves among advanced endoscopy trainees (AETs) using a large national sample of training programs and to develop a centralized database that allows assessment of performance in relation to peers. Methods: ASGE-recognized training programs were invited to participate, and AETs were graded on ERCP and EUS exams using a validated competency assessment tool that assesses technical and cognitive competence in a continuous fashion. Grading for each skill was done using a 4-point scoring system, and a comprehensive data collection and reporting system was built to create learning curves using cumulative sum analysis. Individual results and benchmarking to peers were shared with AETs and trainers quarterly. Results: Of the 62 programs invited, 20 programs and 22 AETs participated in this study. At the end of training, the median number of EUS and ERCP procedures performed per AET was 300 (range 155-650) and 350 (range 125-500), respectively. Overall, 3786 exams were graded (EUS: 1137; ERCP: biliary 2280, pancreatic 369). Learning curves for individual endpoints and for overall technical and cognitive aspects in EUS and ERCP demonstrated substantial variability and were successfully shared with all programs. The majority of trainees achieved overall technical (EUS: 82%; ERCP: 60%) and cognitive (EUS: 76%; ERCP: 100%) competence at the conclusion of training. Conclusions: These results demonstrate the feasibility of establishing a centralized database to report individualized learning curves and confirm the substantial variability in time to achieve competence among AETs in EUS and ERCP.
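The abstract mentions learning curves built with cumulative sum (CUSUM) analysis but gives no parameters. The sketch below shows one common CUSUM formulation for learning curves, in which each success lowers the running score by s and each failure raises it by 1 - s. The acceptable and unacceptable failure rates `p0` and `p1`, the reduction of the 4-point score to success or failure, and the function name `cusum_curve` are all assumptions for illustration and are not taken from the study.

```python
import math


def cusum_curve(outcomes, p0=0.1, p1=0.2):
    """Return the running CUSUM score for a sequence of procedure outcomes.

    outcomes: iterable of booleans, True = successful procedure.
    p0, p1:   assumed acceptable and unacceptable failure rates.
    """
    P = math.log(p1 / p0)
    Q = math.log((1 - p0) / (1 - p1))
    s = Q / (P + Q)  # decrement applied on each success
    curve, total = [], 0.0
    for ok in outcomes:
        total += -s if ok else (1 - s)
        curve.append(total)
    return curve


if __name__ == "__main__":
    # Hypothetical trainee: early failures followed by a run of successes.
    outcomes = [False, False, True, False] + [True] * 12
    for i, score in enumerate(cusum_curve(outcomes), start=1):
        print(i, round(score, 3))
```

A curve that trends downward indicates performance better than the assumed acceptable failure rate, so plotting it per trainee gives the kind of individualized learning curve the abstract refers to.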

IUPUIScholarWorks

Henry Ford Health System Scholarly Commons

Apprentissage de représentations pour la reconnaissance visuelle

Author: Saxena Shreyas
Publication venue: HAL CCSD
Publication date: 12/12/2016

In this dissertation, we propose methods and data-driven machine learning solutions which address and benefit from the recent overwhelming growth of digital media content. First, we consider the problem of improving the efficiency of image retrieval. We propose a coordinated local metric learning (CLML) approach which learns local Mahalanobis metrics and integrates them in a global representation where the l2 distance can be used. This allows for data visualization in a single view, and the use of efficient l2-based retrieval methods. Our approach can be interpreted as learning a linear projection on top of an explicit high-dimensional embedding of a kernel. This interpretation allows for the use of existing frameworks for Mahalanobis metric learning for learning local metrics in a coordinated manner. Our experiments show that CLML improves over previous global and local metric learning approaches for the task of face retrieval. Second, we present an approach to leverage the success of CNN models for visible spectrum face recognition to improve heterogeneous face recognition, e.g., recognition of near-infrared images from visible spectrum training images. We explore different metric learning strategies over features from the intermediate layers of the networks, to reduce the discrepancies between the different modalities. In our experiments we found that the depth of the optimal features for a given modality is positively correlated with the domain shift between the source domain (CNN training data) and the target domain. Experimental results show that we can use CNNs trained on visible spectrum images to obtain results that improve over the state of the art for heterogeneous face recognition with near-infrared images and sketches. Third, we present convolutional neural fabrics for exploring the discrete and exponentially large CNN architecture space in an efficient and systematic manner. Instead of aiming to select a single optimal architecture, we propose a "fabric" that embeds an exponentially large number of architectures. The fabric consists of a 3D trellis that connects response maps at different layers, scales, and channels with a sparse homogeneous local connectivity pattern. The only hyperparameters of the fabric (the number of channels and layers) are not critical for performance. The acyclic nature of the fabric allows us to use backpropagation for learning. Learning can thus efficiently configure the fabric to implement each one of exponentially many architectures and, more generally, ensembles of all of them. While scaling linearly in terms of computation and memory requirements, the fabric leverages exponentially many chain-structured architectures in parallel by massively sharing weights between them. We present benchmark results competitive with the state of the art for image classification on MNIST and CIFAR10, and for semantic segmentation on the Part Labels dataset.

Thèses en Ligne

Hal - Université Grenoble Alpes

INRIA a CCSD electronic archive server

Learning representations for visual recognition

Author: Saxena Shreyas
Publication date: 12/12/2016

In this dissertation, we propose methods and data-driven machine learning solutions which address and benefit from the recent overwhelming growth of digital media content. First, we consider the problem of improving the efficiency of image retrieval. We propose a coordinated local metric learning (CLML) approach which learns local Mahalanobis metrics and integrates them in a global representation where the l2 distance can be used. This allows for data visualization in a single view, and the use of efficient l2-based retrieval methods. Our approach can be interpreted as learning a linear projection on top of an explicit high-dimensional embedding of a kernel. This interpretation allows for the use of existing frameworks for Mahalanobis metric learning for learning local metrics in a coordinated manner. Our experiments show that CLML improves over previous global and local metric learning approaches for the task of face retrieval. Second, we present an approach to leverage the success of CNN models for visible spectrum face recognition to improve heterogeneous face recognition, e.g., recognition of near-infrared images from visible spectrum training images. We explore different metric learning strategies over features from the intermediate layers of the networks, to reduce the discrepancies between the different modalities. In our experiments we found that the depth of the optimal features for a given modality is positively correlated with the domain shift between the source domain (CNN training data) and the target domain. Experimental results show that we can use CNNs trained on visible spectrum images to obtain results that improve over the state of the art for heterogeneous face recognition with near-infrared images and sketches. Third, we present convolutional neural fabrics for exploring the discrete and exponentially large CNN architecture space in an efficient and systematic manner. Instead of aiming to select a single optimal architecture, we propose a "fabric" that embeds an exponentially large number of architectures. The fabric consists of a 3D trellis that connects response maps at different layers, scales, and channels with a sparse homogeneous local connectivity pattern. The only hyperparameters of the fabric (the number of channels and layers) are not critical for performance. The acyclic nature of the fabric allows us to use backpropagation for learning. Learning can thus efficiently configure the fabric to implement each one of exponentially many architectures and, more generally, ensembles of all of them. While scaling linearly in terms of computation and memory requirements, the fabric leverages exponentially many chain-structured architectures in parallel by massively sharing weights between them. We present benchmark results competitive with the state of the art for image classification on MNIST and CIFAR10, and for semantic segmentation on the Part Labels dataset.

Theses.fr

Utility of point-of-care ultrasound in differentiating causes of shock in resource-limited setup

Author: Chawada Bansari
H Humbal Rahulkumar
H Pancholi Krunalkumar
K Patel Shreyas
Parikh Rina Bhavin
Saxena Atulkumar
Publication venue: Medknow
Publication date: 01/01/2019

Background: Delivering an early diagnosis of shock in a resource-limited setting is challenging, especially with limited availability of point-of-care laboratory and radiological diagnostic facilities. There is growing urgency to provide point-of-care diagnosis and treatment for time-sensitive conditions like shock. Aims: We evaluated the application of point-of-care ultrasound (Rapid Ultrasound for Shock and Hypotension [RUSH] protocol), considering the different disease cohorts and practice realities in our setup. Settings and Design: This study was a single-center prospective diagnostic study to check the diagnostic accuracy of point-of-care ultrasound (RUSH protocol). This study was approved by the ethics committee. Materials and Methods: The study was conducted at the emergency medicine department of a tertiary care government hospital in Central Gujarat from November 16 to October 17. All adult patients with clinical features of shock with systolic blood pressure 1 presenting to emergency department were included as participants. The results of point-of-care ultrasound (RUSH protocol) were compared with the diagnosis given by consultants of the respective departments as per standard departmental practices. Statistical Analysis and Results: A total of 130 patients were enrolled in this study. The mean examination time with point-of-care ultrasound (RUSH protocol) was 12 min (range 11–14 min). The kappa index was 0.860. This protocol correctly diagnosed 100% of obstructive shock, 96.3% of cardiogenic shock, 94.4% of hypovolemic shock, 80.9% of mixed-type shock, and 75% of distributive shock. Conclusion: This study highlights the role of point-of-care ultrasound (RUSH protocol) in the early diagnosis of shock etiology in the emergency medicine department. Diagnosis using point-of-care ultrasound (RUSH protocol) agreed significantly with the medical diagnosis. Point-of-care ultrasound (RUSH protocol) differentiated the causes of shock with good accuracy, except for distributive shock.
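The kappa index reported above measures agreement between two diagnostic methods beyond what chance alone would produce. Assuming it refers to Cohen's kappa (the abstract does not say which variant was used), the statistic can be computed from a cross-tabulation as sketched below. The function name `cohens_kappa` and the 2x2 table `toy` are hypothetical and are not the study's data.

```python
def cohens_kappa(table):
    """Cohen's kappa for a square agreement table.

    table[i][j] = number of cases assigned category i by method A
    and category j by method B.
    """
    n = sum(sum(row) for row in table)
    k = len(table)
    p_obs = sum(table[i][i] for i in range(k)) / n              # observed agreement
    row = [sum(table[i]) / n for i in range(k)]                 # marginals of method A
    col = [sum(table[i][j] for i in range(k)) / n for j in range(k)]  # marginals of method B
    p_exp = sum(r * c for r, c in zip(row, col))                # agreement expected by chance
    return (p_obs - p_exp) / (1 - p_exp)


# Hypothetical counts for illustration only.
toy = [[20, 5], [10, 15]]

if __name__ == "__main__":
    print(round(cohens_kappa(toy), 3))
```

A value of 1 means perfect agreement and 0 means agreement no better than chance, so a kappa of 0.860 indicates strong agreement between the ultrasound protocol and the consultants' diagnoses.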

Directory of Open Access Journals